An improved algorithm for feature selection using fractal dimension
Abstract
Dimensionality reduction is an important issue in data mining and machine learning. Traina [1] proposed a feature selection algorithm that selects the most important attributes of a given set of n-dimensional vectors based on the correlation fractal dimension, using a kind of multi-dimensional “quad-tree” structure to compute the fractal dimension. Inspired by that work, we propose a new and simpler algorithm for computing the fractal dimension, and design a novel and faster feature selection algorithm based on the correlation fractal dimension, whose time complexity is lower than that of Traina’s. The main idea is that, when computing the fractal dimension of the (d-1)-dimensional data, the intermediate results generated for the extended d-dimensional data are reused. The algorithm inherits the desirable properties described in [1]. In addition, our algorithm does not require the grid sizes to decrease by half at each step, as the original “quad-tree” algorithm does. Experiments show that our feature selection algorithm is efficient on the test dataset.
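As a rough illustration of the quantities the abstract refers to, the sketch below estimates the correlation fractal dimension by plain box counting and uses it for one backward-elimination step. It is not the algorithm proposed in the paper: it recomputes every grid from scratch, so it neither uses a “quad-tree” nor reuses the d-dimensional intermediate results. The function names, the normalisation to the unit cube, and the least-squares slope fit are assumptions made for this example.

```python
import numpy as np


def correlation_fractal_dimension(points, radii):
    """Estimate D2 as the slope of log S(r) against log r.

    S(r) is the sum of squared point counts over the occupied grid cells
    of side r. The radii may be any list of cell sizes in (0, 1]; they do
    not have to halve from one step to the next.
    """
    points = np.asarray(points, dtype=float)
    # Assumption: rescale every attribute to [0, 1] so one grid fits all.
    low = points.min(axis=0)
    span = points.max(axis=0) - low
    span[span == 0] = 1.0  # constant attributes stay at 0
    unit = (points - low) / span

    log_r, log_s = [], []
    for r in radii:
        # Map each point to the integer coordinates of its grid cell.
        cells = np.floor(unit / r).astype(np.int64)
        _, counts = np.unique(cells, axis=0, return_counts=True)
        log_r.append(np.log(r))
        log_s.append(np.log(np.sum(counts.astype(float) ** 2)))

    # Least-squares line through the (log r, log S(r)) points.
    slope, _intercept = np.polyfit(log_r, log_s, 1)
    return slope


def least_important_attribute(points, radii):
    """Return the index of the attribute whose removal changes D2 least."""
    points = np.asarray(points, dtype=float)
    full = correlation_fractal_dimension(points, radii)
    changes = [
        abs(full - correlation_fractal_dimension(
            np.delete(points, j, axis=1), radii))
        for j in range(points.shape[1])
    ]
    return int(np.argmin(changes))
```

Calling `least_important_attribute` repeatedly and dropping the returned column each time gives a simple backward-elimination loop. Each call costs one full box count per remaining attribute, which is the kind of repeated work the abstract says its reuse of intermediate results avoids.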
Similar resources
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...
A Method for Labeling Brain Signals to Classify Different States of Anesthesia
Aims and background: This study develops a computational framework for the classification of different anesthesia states, including awake, moderate anesthesia, and general anesthesia, using electroencephalography (EEG) signals and peripheral parameters. Materials and Methods: The proposed method proposes ...